An Approach to Extracting Knowledge from Legacy Documents

نویسنده

  • Richard Crowder
چکیده

Organisations are increasingly information intensive; hence providing access to data that is trapped in various proprietary forms including catalogues, databases, human resource systems and internally generated documents is now becoming a significant and challenging task. The authors have undertaken research into approaches to capture relevant knowledge from legacy documents. This is achieved by converting the legacy documents to XML, (eXtensible Markup Language), documents where the output is semantically tagged. Once in an XML form, the data can be easily transformed. This paper describes the development of tools to automate the process of converting legacy documents to XML documents. The purpose of this work is improve the efficiency and reliability of Expertise Finder suitable for use within an engineering design environment. We will also show that by querying the resultant XML versions of legacy documents provides better results than a basic text search over the identical documents when applied used within an Expertise Finder. Keyword: Knowledge management, Expertise finders, XML.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering

Although many studies have been conducted to improve the clustering efficiency, most of the state-of-art schemes suffer from the lack of robustness and stability. This paper is aimed at proposing an efficient approach to elicit prior knowledge in terms of must-link and cannot-link from the estimated distribution of raw data in order to convert a blind clustering problem into a semi-supervised o...

متن کامل

ارائه مدلی برای استخراج اطلاعات از مستندات متنی، مبتنی بر متن‌کاوی در حوزه یادگیری الکترونیکی

As computer networks become the backbones of science and economy, enormous quantities documents become available. So, for extracting useful information from textual data, text mining techniques have been used. Text Mining has become an important research area that discoveries unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...

متن کامل

A Semantic Portal for Fund Finding in the EU: Semantic Upgrade, Integration and Publication of Heterogeneous Legacy Data

FundFinder is a Semantic Web portal that allows searching for and navigating through information about funding opportunities. This application has been created following a set of techniques and using a set of tools for the upgrade of legacy content to the Semantic Web, including databases and semistructured documents. This process consists in extracting and populating knowledge from heterogeneo...

متن کامل

Language Engineering for the Recovery of Requirements From Legacy Documents

Legacy documents, such as requirements documents or manuals of business procedures, can sometimes offer an important resource for informing what features of legacy software are redundant, need to be retained or can be reused. This situation is particularly acute where business change has resulted in the dissipation of human knowledge through staff turnover or redeployment. Exploiting legacy doc...

متن کامل

An Approach for Extracting the Main Characteristics of a Smart City

Nowadays, “Smart City” is a very growing and promoting concept. Unlike other concepts related to cities, like digital or green city, it encompasses all technological, human and institutional factors. However, there are too many different viewpoints and definitions of this concept in the literature. The problem statement of this research is that despite the existence of many points of view and d...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004